Installation

source("http://bioconductor.org/biocLite.R")
biocLite()
biocLite("EBImage")

Import/Export Image

Read image file

Read an image file as an 3D array of doubles ranging from 0 to 1. The third dimension is the slot for the three channels: Red, Green and Blue, or RGB.

library("EBImage")
img <- readImage("cat.jpg")
print(img)
## Image 
##   colorMode    : Color 
##   storage.mode : double 
##   dim          : 600 400 3 
##   frames.total : 3 
##   frames.render: 1 
## 
## imageData(object)[1:5,1:6,1]
##           [,1]      [,2]      [,3]      [,4]      [,5]      [,6]
## [1,] 0.1176471 0.1137255 0.1137255 0.1098039 0.1137255 0.1176471
## [2,] 0.1176471 0.1137255 0.1137255 0.1137255 0.1137255 0.1176471
## [3,] 0.1176471 0.1176471 0.1137255 0.1137255 0.1176471 0.1176471
## [4,] 0.1176471 0.1176471 0.1176471 0.1176471 0.1215686 0.1215686
## [5,] 0.1176471 0.1176471 0.1215686 0.1215686 0.1254902 0.1254902

Load images from arrays of pixel values ** Example: handwritten zipcode digits in greyscale of size \(16*16\)

load("zipcode.RData")
dat[1:5, 1:5]
##   V1 V2 V3     V4     V5
## 1 -1 -1 -1 -1.000 -1.000
## 2 -1 -1 -1 -1.000 -1.000
## 3 -1 -1 -1 -1.000 -1.000
## 4 -1 -1 -1 -0.929  0.351
## 5 -1 -1 -1 -1.000 -0.869

Display image

  • With EBImage:display image using an interactive JavaScript viewer (shown in browser) or R’s built-in graphics capabilities
display(img)
text(x = 20, y = 20, label = "Cat", adj = c(0,1), col = "orange", cex = 2)

  • To display in R graphics:
options("EBImage.display"= "raster") 
  • With default graphics package:
n_r <- n_c <- 16
vec <- dat[70,]
M <- matrix(vec, n_r, n_c)
MM <- M[,rev(1:ncol(M))]
par(mfrow=c(1,2))
image(x=1:n_r, y=1:n_c, z=M, axes = FALSE, xlab="", ylab="", col = grey(seq(0, 1, length = 256)))
image(x=1:n_r, y=1:n_c, z=MM, axes = FALSE, xlab="", ylab="", col = grey(seq(0, 1, length = 256)))

par(mfrow=c(1,1))
  • Conversion to Image object
img_zip <- Image(vec, dim=c(16, 16))
display(img_zip)

Image data representation

EBImage uses a package-specific class Image to store and process images. It extends the R base class array, and all EBImage functions can also be called directly on matrices and arrays.

str(img)
## Formal class 'Image' [package "EBImage"] with 2 slots
##   ..@ .Data    : num [1:600, 1:400, 1:3] 0.118 0.118 0.118 0.118 0.118 ...
##   ..@ colormode: int 2

Image data can be accessed as a plain R array using the imageData accessor,

dim(img)
## [1] 600 400   3
imageData(img)[1:3, 1:6,]
## , , 1
## 
##           [,1]      [,2]      [,3]      [,4]      [,5]      [,6]
## [1,] 0.1176471 0.1137255 0.1137255 0.1098039 0.1137255 0.1176471
## [2,] 0.1176471 0.1137255 0.1137255 0.1137255 0.1137255 0.1176471
## [3,] 0.1176471 0.1176471 0.1137255 0.1137255 0.1176471 0.1176471
## 
## , , 2
## 
##           [,1]      [,2]     [,3]      [,4]      [,5]      [,6]
## [1,] 0.1490196 0.1450980 0.145098 0.1411765 0.1450980 0.1490196
## [2,] 0.1490196 0.1450980 0.145098 0.1450980 0.1450980 0.1490196
## [3,] 0.1490196 0.1490196 0.145098 0.1450980 0.1490196 0.1490196
## 
## , , 3
## 
##           [,1]      [,2]      [,3]       [,4]      [,5]      [,6]
## [1,] 0.1058824 0.1019608 0.1019608 0.09803922 0.1019608 0.1058824
## [2,] 0.1058824 0.1019608 0.1019608 0.10196078 0.1019608 0.1058824
## [3,] 0.1058824 0.1058824 0.1019608 0.10196078 0.1058824 0.1058824

The distribution of pixel intensities can be plotted in a histogram, and their range inspected using the range function.

hist(img)

Basic Image Operations

Resize image

img_small <- resize(img, 128, 128)
display(img_small)

Display muptiple images

img_dog <- readImage("dog.jpg")
img_dog <- resize(img_dog, 128, 128)
img_all <- EBImage::combine(img_small, img_dog)
display(img_all, all=TRUE)

img_all2 <- EBImage::combine(img_small, flip(img_small), flop(img_small))
display(img_all2, all=TRUE)

Adjust brightness

img_bright <- img + 0.2
img_dark <- img - 0.2
display(combine(img_bright, img_dark), all=TRUE)

Adjust contrast

img_low <- img * 0.5
img_high <- img * 2
display(combine(img_low, img_high), all=TRUE)

Crop image

display(img[300:450, 50:200,])

Spatial Transformation

img_rotate <- translate(rotate(img, 45), c(50, 0))
display(img_rotate)

Color Management

Transformation to grayscale

  • Mode =“gray”/“grey”: converts a Color image into a Grayscale image, using uniform 1/3 RGB weights.
display(channel(img, mode='gray'))

* Mode = “luminance”: luminance-preserving Color to Grayscale conversion using CIE 1931 luminance weights: 0.2126 * R + 0.7152 * G + 0.0722 * B.

display(channel(img, mode='luminance'))

Transformation to other color channel

  • Mode= “asred”, “asgreen”, “asblue”: converts a Grayscale image or an array into a Color image of the specified hue.
display(channel(img_zip, mode='asred'))

Filtering

Linear Filtering

A common preprocessing step involves cleaning up the images by removing local artifacts or noise through smoothing. An intuitive approach is to define a window of a selected size around each pixel and average the values within that neighborhood. After applying this procedure to all pixels, the new, smoothed image is obtained. Mathematically, this can be expressed as \[f'(x,y)=\frac{1}{N} \sum_{s=-a}^{s=a} \sum_{t=-a}^{s=a} f(x+s, y+t) \] where \(f'(x,y)\) is the value of the pixel at position \((x,y)\), and aa determines the window size, which is \(2a+1\) in each direction. \(N=(2a+1)2\) is the number of pixels averaged over, and \(f′\) is the new, smoothed image.

More generally, we can replace the moving average by a weighted average, using a weight function ww, which typically has the highest value at the window midpoint (s=t=0s=t=0) and then decreases towards the edges. \[ (w∗f)(x,y)=\sum_{s=-\infty}^{s=\infty} w(s,t) f(x+s, y+s)\]

For notational convenience, we let the summations range from \(-\infty\)to \(\infty\), even if in practice the sums are finite and \(w\) has only a finite number of non-zero values. In fact, we can think of the weight function \(w\) as another image, and this operation is also called the convolution of the images \(f\) and \(w\), indicated by the the symbol \(∗\). Convolution is a linear operation in the sense that \(w∗(c_1f_1+c_2f_2)=c_1w∗f_1+c_2w∗f_2\) for any two images \(f_1\), \(f_2\) and numbers \(c_1\), \(c_2\).

In EBImage, the 2-dimensional convolution is implemented by the function filter2, and the auxiliary function makeBrush can be used to generate the weight function. In fact, filter2 does not directly perform the summation indicated in the equation above. Instead, it uses the Fast Fourier Transformation in a way that is mathematically equivalent but computationally more efficient.

w <- makeBrush(size = 31, shape = 'gaussian', sigma = 5)
plot(w[(nrow(w)+1)/2, ], ylab = "w", xlab = "", cex = 0.7)

img_flo <- filter2(img, w)
display(img_flo)

Here we have used a Gaussian filter of width 5 given by sigma. Other available filter shapes include “box” (default), “disc”, “diamond” and “line”, for some of which the kernel can be binary; see ?makeBrush for details.

If the filtered image contains multiple frames, the filter is applied to each frame separately. For convenience, images can be also smoothed using the wrapper function gblur which performs Gaussian smoothing with the filter size automatically adjusted to sigma.

In signal processing the operation of smoothing an image is referred to as low-pass filtering. High-pass filtering is the opposite operation which allows to detect edges and sharpen images. This can be done, for instance, using a Laplacian filter.

  • Low-pass linear filtering: to blur images, remove noise…
f_low <- makeBrush(21, shape='disc', step=FALSE)
display(f_low, title='Disc filter')

f_low <- f_low/sum(f_low)
img_lowPass <- filter2(img, filter=f_low)
display(img_lowPass, title='Filtered image')

  • High-pass Laplacian filtering: to detect edges, sharpen images
f_high <- matrix(1, nc=3, nr=3)
f_high[2,2] <- -8
img_highPass <- filter2(img, f_high)
display(img_highPass, title='Filtered image')

Median filter

Another approach to perform noise reduction is to apply a median filter, which is a non-linear technique as opposed to the low pass convolution filter described in the previous section. Median filtering is particularly effective in the case of speckle noise, and has the advantage of removing noise while preserving edges.

The local median filter works by scanning the image pixel by pixel, replacing each pixel by the median on of its neighbors inside a window of specified size. This filtering technique is provided in EBImage by the function medianFilter. We demonstrate its use by first corrupting the image with uniform noise, and reconstructing the original image by median filtering.

img_lena <- readImage("lena_noisy.png")
img_medianFlt_2 <- medianFilter(img_lena, size=2)
img_medianFlt_5 <- medianFilter(img_lena, size=5)
img_c <- combine(img_lena, img_medianFlt_2, img_medianFlt_5)
display(img_c, all=TRUE)

Segmentation

Adaptive thresholding

The idea of adaptive thresholding is that, compared to straightforward thresholding from the previous section, the threshold is allowed to be different in different regions of the image. In this way, one can anticipate spatial dependencies of the underlying background signal caused, for instance, by uneven illumination or by stray signal from nearby bright objects.

Adaptive thresholding works by comparing each pixel’s intensity to the background determined from a local neighbourhood. This can be achieved by comparing the image to its smoothed version, where the filtering window is bigger than the typical size of objects we want to capture.

The function thresh returns the binary image resulting from the comparison between an image and its filtered version with a rectangular window.

img_bengal <- readImage("Bengal_1.jpg")
display(img_bengal)

img_bengal_bw <- channel(img_bengal, mode="gray")
img_seg1 <- thresh(img_bengal_bw, w=60, h=60, offset=0.06) # {f = matrix(1, nc=2*w+1, nr=2*h+1) ; f=f/sum(f) ; x>(filter2(x, f)+offset)}
img_seg2 <- opening(img_seg1, kern=makeBrush(11, shape='disc'))
img_seg3 <- opening(img_seg1, kern=makeBrush(5, shape='disc'))
img_seg <- combine(img_seg1, img_seg2, img_seg3)
display(img_seg, all=TRUE)

Morphological operations

  • dilate applies the mask kern by positioning its center over every pixel of the image x, the output value of the pixel is the maximum value of x covered by the mask. In case of binary images this is equivalent of putting the mask over every background pixel, and setting it to foreground if any of the pixels covered by the mask is from the foreground.
  • erode applies the mask kern by positioning its center over every pixel of the image x, the output value of the pixel is the minimum value of x covered by the mask. In case of binary images this is equivalent of putting the mask over every foreground pixel, and setting it to background if any of the pixels covered by the mask is from the background.
  • opening is an erosion followed by a dilation and closing is a dilation followed by an erosion.
  • whiteTopHat returns the difference between the original image x and its opening by the structuring element kern.
  • blackTopHat subtracts the original image x from its closing by the structuring element kern.
  • selfComplementaryTopHat is the sum of the whiteTopHat and the blackTopHat, simplified the difference between the closing and the opening of the image.
  • makeBrush generates brushes of various sizes and shapes that can be used as structuring elements.

Outline Analysis

Oriented Contour

img_leaf <- readImage("leaf.png")
img_leaf <- resize(img_leaf, 128, 128)
img_leaf <- channel(img_leaf, mode="gray")
img_leaf1 <- thresh(img_leaf, w=50, h=50, offset=0.05)
display(combine(img_leaf,img_leaf1), all=TRUE)

oc <- ocontour(bwlabel(img_leaf1))
plot(oc[[1]], type='l')
points(oc[[1]], col=2)

print(head(oc[[1]]))
##      [,1] [,2]
## [1,]   61    2
## [2,]   61    3
## [3,]   60    4
## [4,]   59    4
## [5,]   58    5
## [6,]   57    6

However, for some complex scenes, extracting object outline is actually hard. For instance:

img_bengal <- readImage("Bengal_1.jpg")
img_bengal_bw <- channel(img_bengal, mode="gray")
img_seg1 <- thresh(img_bengal_bw, w=60, h=60, offset=0.06)
img_seg2 <- dilate(img_seg1, kern=makeBrush(25, shape='gaussian'))
img_fill <- fillHull(img_seg2)
oc_bengal <- ocontour(bwlabel(img_fill))
plot(oc_bengal[[1]], type="l")

Local Curvature

  • Local Curvature fits a local non-parametric smoothing line (polynomial of degree 2) at each point along the line segment, and computes the curvature locally using numerical derivatives.
lc <- localCurvature(x=oc[[1]], h=11)
print(head(lc$curvature))
## [1] 0.0958879 0.1066456 0.1094961 0.1158709 0.1530234 0.1541314
i <- lc$curvature >= 0
neg <- array(0, dim(img_leaf1))
pos <- neg
pos[lc$contour[i,]+1]  <- lc$curvature[i]
neg[lc$contour[!i,]+1] <- -lc$curvature[!i]
display(10*(rgbImage(pos, , neg)), title = "Image curvature")  

Shape Analysis with “Momocs”

The package “Momocs” intended to ease and popularize shape analysis of outlines (especially using elliptical Fourier analysis). It includes some of the most common approaches: traditional morphometrics, global descriptors, open outlines, closed outlines, configuration of landmarks and etc.. It is built with a class coo for handling morphometric datasets, i.e. lists of (x; y) outline coordinates.

library(Momocs)
## 
## Attaching package: 'Momocs'
## The following object is masked from 'package:utils':
## 
##     stack
coo_leaf <- Coo(oc[[1]])
coo.plot(coo_leaf[1]) #plot a simple wrapper for plotting shapes

coo.plot(coo.sample(coo_leaf[1], 64)) #samples n points in coo along the curvilinear abscissa.

coo.plot(coo.center(coo_leaf[1]), main="Centered Leaf") #coo.center centers the coo's centroid on the origin.

par(mar=rep(4,4))
coo.plot(coo_leaf[1], points=TRUE, main="Smoothing")
coo <- coo.sample(coo_leaf[1], 100)
s <- seq(1, 500, length=10)
cols <- col.summer(10)
for (i in seq(along=s)) {
  coo.draw(coo.smooth(coo, s[i]), col=NA, border=cols[i])
}

For more information of how to use Momocs package, please see the reference.

Color Feautres

The distribution of pixel colors can be informative in discriminating different objects. For instance:

Color Features from RGB

By discretizing the RGB values, we can get a set of color features. We subdivide the pixel values in each color channel into multiple bands of equal width. Then the counts of pixels in correponding bins consist of a set of color features that characterizes the color distribution of the image.

mat <- imageData(img)
nR <- 10
nG <- 8
nB <- 10
# Caution: the bins should be consistent across all images!
rBin <- seq(0, 1, length.out=nR)
gBin <- seq(0, 1, length.out=nG)
bBin <- seq(0, 1, length.out=nB)
freq_rgb <- as.data.frame(table(factor(findInterval(mat[,,1], rBin), levels=1:nR), 
                                factor(findInterval(mat[,,2], gBin), levels=1:nG), 
                                factor(findInterval(mat[,,3], bBin), levels=1:nB)))
rgb_feature <- as.numeric(freq_rgb$Freq)/(ncol(mat)*nrow(mat)) # normalization

The number of bins for each color channel (\(n_R, n_G, n_B\)) is a tuning parameter for constructing color features. The total number of features is \(n_R n_G n_B\).

Color Features From HSV

The Hue, Saturation, Value (HSV) model of color is closer to human perception of color, and thus is eaiser to interpret than RGB model. Using the same discretization, color features can be extracted from the histogram of HSV values.

library(grDevices)
# Convert 3d array of RGB to 2d matrix
mat_rgb <- mat
dim(mat_rgb) <- c(nrow(mat)*ncol(mat), 3)
mat_hsv <- rgb2hsv(t(mat_rgb))
nH <- 10
nS <- 6
nV <- 6
# Caution: determine the bins using all images! The bins should be consistent across all images. The following code is only used for demonstration on a single image.
hBin <- seq(0, 1, length.out=nH)
sBin <- seq(0, 1, length.out=nS)
vBin <- seq(0, 0.005, length.out=nV) 
freq_hsv <- as.data.frame(table(factor(findInterval(mat_hsv[1,], hBin), levels=1:nH), 
                                factor(findInterval(mat_hsv[2,], sBin), levels=1:nS), 
                                factor(findInterval(mat_hsv[3,], vBin), levels=1:nV)))
hsv_feature <- as.numeric(freq_hsv$Freq)/(ncol(mat)*nrow(mat)) # normalization

The number of bins for each model (\(n_H, n_S, n_V\)) is a tuning parameter for constructing color features. The total number of features is \(n_H n_S n_V\).

Spatial Color Features

To obtain the information of the color distributions at different sections of the image, we can divide the image into \(N\) vertical and \(M\) horizontal strips of equal width as the spatial histograms.The spatial histogram can be integrated with RGB-based color features or HSV-based color features. Note that, to perform spatial histogram, all images should be resized to a uniform size.

# Example with RGB-based color features
N <- 3 # number of bins in x-axis
M <- 5 # number of bins in y-axis
p_x <- p_y <- 250
img_s <- resize(img, p_x, p_y)
xbin <- floor(seq(0, p_x, length.out= N+1))
ybin <- seq(0, p_y, length.out=M+1)

construct_rgb_feature <- function(X){
  freq_rgb <- as.data.frame(table(factor(findInterval(X[,,1], rBin), levels=1:nR), 
                                factor(findInterval(X[,,2], gBin), levels=1:nG), 
                                factor(findInterval(X[,,3], bBin), levels=1:nB)))
  rgb_feature <- as.numeric(freq_rgb$Freq)/(ncol(X)*nrow(X))
  return(rgb_feature)
}

ff <- rep(NA, N*M*nR*nG*nB)
for(i in 1:N){
  for(j in 1:M){
    tmp <- img_s[(xbin[i]+1):xbin[i+1], (ybin[j]+1):ybin[j+1], ]
    ff[((M*(i-1)+j-1)*nR*nG*nB+1):((M*(i-1)+j)*nR*nG*nB)] <- construct_rgb_feature(tmp) 
  }
}
cat("length of spatial color feature =", length(ff), "\n")
## length of spatial color feature = 12000

The numbers of spatial bins \(N\) and \(M\) are also tuning parameters in addition to \(n_R, n_G, n_B\). The total number of features is then \(NMn_R n_G n_B\). The HSV-based color features can be constructed in similar fashion.

Visual Bag-of-Words

Coming soon in Advanced Image Analysis.

Cat & Dog Classification

This dataset contains 7349 images, including 2371 cat images and 4978 dog images. It has 37 breed category with roughly 200 images for each class. The images have a large variations in scale, pose and lighting. The goal is to build a classifier on features constructed from the images to differentiate cats and dogs.

Get the data: https://drive.google.com/a/columbia.edu/folderview?id=0Bxb_LWNRPA-uVFhTUmFWRk1zNk0&usp=sharing

!! Please login to your LionMail account to get the access and to download the data.

Determine class label

Extract class labels from the image names

dir_images <- "./catdog/images"
dir_names <- list.files(dir_images)

breed_name <- rep(NA, length(dir_names))
for(i in 1:length(dir_names)){
  tt <- unlist(strsplit(dir_names[i], "_"))
  tt <- tt[-length(tt)]
  breed_name[i] = paste(tt, collapse="_", sep="")
}
cat_breed <- c("Abyssinian", "Bengal", "Birman", "Bombay", "British_Shorthair", "Egyptian_Mau",
               "Maine_Coon", "Persian", "Ragdoll", "Russian_Blue", "Siamese", "Sphynx")

iscat <- breed_name %in% cat_breed
y_cat <- as.numeric(iscat)

Read attributes from XML files

Within the entire dataset, there are 3686 images provided with a tight bound box around the head of the animal.
The corner locations of the bounded box of the detected face in an image is stored in an XML file in the annotation folder. To extract the information, we can read attributes from XML files in R.

XML is a file extension for an Extensible Markup Language (XML) file format used to create common information formats and share both the format and the data on the web. XML describes the content in terms of what data is being described. Unlike HTML, XML does not use predefined tags

library(XML)
ann <- xmlParse("sample_annotation.xml")
head_box <- xmlToDataFrame(ann, nodes=getNodeSet(ann,"//annotation/object/bndbox"))

Reference